Search | VHL Regional Portal

1.

Size of the protein-protein energy funnel in crowded environment.

Jenkins, Nathan W; Kundrotas, Petras J; Vakser, Ilya A.

Front Mol Biosci ; 9: 1031225, 2022.

Article in English | MEDLINE | ID: mdl-36425657

ABSTRACT

Association of proteins to a significant extent is determined by their geometric complementarity. Large-scale recognition factors, which directly relate to the funnel-like intermolecular energy landscape, provide important insights into the basic rules of protein recognition. Previously, we showed that simple energy functions and coarse-grained models reveal major characteristics of the energy landscape. As new computational approaches increasingly address structural modeling of a whole cell at the molecular level, it becomes important to account for the crowded environment inside the cell. The crowded environment drastically changes protein recognition properties, and thus significantly alters the underlying energy landscape. In this study, we addressed the effect of crowding on the protein binding funnel, focusing on the size of the funnel. As crowders occupy the funnel volume, they make it less accessible to the ligands. Thus, the funnel size, which can be defined by ligand occupancy, is generally reduced as the crowder concentration increases. This study quantifies this reduction for different concentrations of crowders and correlates this dependence with the structural details of the interacting proteins. The results provide a better understanding of the rules of protein association in the crowded environment.

2.

Dockground resource for protein recognition studies.

Collins, Keeley W; Copeland, Matthew M; Kotthoff, Ian; Singh, Amar; Kundrotas, Petras J; Vakser, Ilya A.

Protein Sci ; 31(12): e4481, 2022 12.

Article in English | MEDLINE | ID: mdl-36281025

ABSTRACT

Structural information of protein-protein interactions is essential for characterization of life processes at the molecular level. While a small fraction of known protein interactions has experimentally determined structures, computational modeling of protein complexes (protein docking) has to fill the gap. The Dockground resource (http://dockground.compbio.ku.edu) provides a collection of datasets for the development and testing of protein docking techniques. Currently, Dockground contains datasets for the bound and the unbound (experimentally determined and simulated) protein structures, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative docking. The Dockground bound proteins dataset is a core set, from which other Dockground datasets are generated. It is devised as a relational PostgreSQL database containing information on experimentally determined protein-protein complexes. This report on the Dockground resource describes current status of the datasets, new automated update procedures and further development of the core datasets. We also present a new Dockground interactive web interface, which allows search by various parameters, such as release date, multimeric state, complex type, structure resolution, and so on, visualization of the search results with a number of customizable parameters, as well as downloadable datasets with predefined levels of sequence and structure redundancy.

Subjects

Proteins, Software, Proteins/chemistry, Computer Simulation, Protein Binding, Molecular Docking Simulation, Protein Conformation, Computational Biology/methods

3.

Docking-based long timescale simulation of cell-size protein systems at atomic resolution.

Vakser, Ilya A; Grudinin, Sergei; Jenkins, Nathan W; Kundrotas, Petras J; Deeds, Eric J.

Proc Natl Acad Sci U S A ; 119(41): e2210249119, 2022 10 11.

Article in English | MEDLINE | ID: mdl-36191203

ABSTRACT

Computational methodologies are increasingly addressing modeling of the whole cell at the molecular level. Proteins and their interactions are the key component of cellular processes. Techniques for modeling protein interactions, thus far, have included protein docking and molecular simulation. The latter approaches account for the dynamics of the interactions but are relatively slow, if carried out at all-atom resolution, or are significantly coarse grained. Protein docking algorithms are far more efficient in sampling spatial coordinates. However, they do not account for the kinetics of the association (i.e., they do not involve the time coordinate). Our proof-of-concept study bridges the two modeling approaches, developing an approach that can reach unprecedented simulation timescales at all-atom resolution. The global intermolecular energy landscape of a large system of proteins was mapped by the pairwise fast Fourier transform docking and sampled in space and time by Monte Carlo simulations. The simulation protocol was parametrized on existing data and validated on a number of observations from experiments and molecular dynamics simulations. The simulation protocol performed consistently across very different systems of proteins at different protein concentrations. It recapitulated data on the previously observed protein diffusion rates and aggregation. The speed of calculation allows reaching second-long trajectories of protein systems that approach the size of the cells, at atomic resolution.

Subjects

Molecular Dynamics Simulation, Proteins, Algorithms, Biophysical Phenomena, Kinetics, Monte Carlo Method
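
The abstract of entry 3 describes sampling a docking-derived energy landscape in space and time by Monte Carlo simulation. As a minimal sketch of the general idea only, the Python below runs a Metropolis random walk over a small table of precomputed energies. The energy table, the ring-shaped neighbor moves, the temperature units, and all function names are illustrative assumptions; none of them come from the paper's protocol.

```python
import math
import random


def acceptance_probability(delta_e, kt):
    """Metropolis rule: always accept downhill moves, else exp(-dE/kT)."""
    if delta_e <= 0.0:
        return 1.0
    return math.exp(-delta_e / kt)


def metropolis_walk(energies, n_steps, kt=1.0, start=0, seed=0):
    """Random walk over states arranged on a ring (hypothetical move set).

    energies: precomputed energy of each state, standing in for a
    docking-derived landscape. Returns the list of visited state indices.
    """
    rng = random.Random(seed)
    state = start
    trajectory = [state]
    for _ in range(n_steps):
        # Propose a move to the left or right neighbor on the ring.
        candidate = (state + rng.choice((-1, 1))) % len(energies)
        delta_e = energies[candidate] - energies[state]
        if rng.random() < acceptance_probability(delta_e, kt):
            state = candidate
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    landscape = [0.0, -1.0, -3.0, -1.0, 0.0, 0.5]  # toy funnel at index 2
    walk = metropolis_walk(landscape, n_steps=1000, kt=0.5)
    print("fraction of steps in the funnel minimum:", walk.count(2) / len(walk))
```

In this toy setting the walk should spend most of its time near the lowest-energy state, which is the qualitative behavior a funnel-like landscape is expected to produce.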

4.

GWYRE: A Resource for Mapping Variants onto Experimental and Modeled Structures of Human Protein Complexes.

Malladi, Sukhaswami; Powell, Harold R; David, Alessia; Islam, Suhail A; Copeland, Matthew M; Kundrotas, Petras J; Sternberg, Michael J E; Vakser, Ilya A.

J Mol Biol ; 434(11): 167608, 2022 06 15.

Article in English | MEDLINE | ID: mdl-35662458

ABSTRACT

Rapid progress in structural modeling of proteins and their interactions is powered by advances in knowledge-based methodologies along with better understanding of physical principles of protein structure and function. The pool of structural data for modeling of proteins and protein-protein complexes is constantly increasing due to the rapid growth of protein interaction databases and Protein Data Bank. The GWYRE (Genome Wide PhYRE) project capitalizes on these developments by advancing and applying new powerful modeling methodologies to structural modeling of protein-protein interactions and genetic variation. The methods integrate knowledge-based tertiary structure prediction using Phyre2 and quaternary structure prediction using template-based docking by a full-structure alignment protocol to generate models for binary complexes. The predictions are incorporated in a comprehensive public resource for structural characterization of the human interactome and the location of human genetic variants. The GWYRE resource facilitates better understanding of principles of protein interaction and structure/function relationships. The resource is available at http://www.gwyre.org.

Subjects

Protein Interaction Mapping, Proteins, Software, Binding Sites, Computational Biology/methods, Protein Databases, Humans, Molecular Docking Simulation, Protein Binding, Protein Interaction Mapping/methods, Proteins/chemistry

5.

DOCKGROUND membrane protein-protein set.

Kotthoff, Ian; Kundrotas, Petras J; Vakser, Ilya A.

PLoS One ; 17(5): e0267531, 2022.

Article in English | MEDLINE | ID: mdl-35580077

ABSTRACT

Membrane proteins are significantly underrepresented in Protein Data Bank despite their essential role in cellular mechanisms and the major progress in experimental protein structure determination. Thus, computational approaches are especially valuable in the case of membrane proteins and their assemblies. The main focus in developing structure prediction techniques has been on soluble proteins, in part due to much greater availability of the structural data. Currently, structure prediction of protein complexes (protein docking) is a well-developed field of study. However, the generic protein docking approaches are not optimal for the membrane proteins because of the differences in physicochemical environment and the spatial constraints imposed by the membranes. Thus, docking of the membrane proteins requires specialized computational methods. Development and benchmarking of the membrane protein docking approaches has to be based on high-quality sets of membrane protein complexes. In this study we present a new dataset of 456 non-redundant alpha helical binary interfaces. The set is significantly larger and more representative than the previously developed sets. In the future, it will become the basis for the development of docking and scoring benchmarks, similar to the ones for soluble proteins in the Dockground resource http://dockground.compbio.ku.edu.

Subjects

Benchmarking, Membrane Proteins, Computational Biology/methods, Protein Databases, Molecular Docking Simulation, Protein Binding, Software

6.

Dockground scoring benchmarks for protein docking.

Kotthoff, Ian; Kundrotas, Petras J; Vakser, Ilya A.

Proteins ; 90(6): 1259-1266, 2022 06.

Article in English | MEDLINE | ID: mdl-35072956

ABSTRACT

Protein docking protocols typically involve global docking scan, followed by re-ranking of the scan predictions by more accurate scoring functions that are either computationally too expensive or algorithmically impossible to include in the global scan. Development and validation of scoring methodologies are often performed on scoring benchmark sets (docking decoys) which offer concise and nonredundant representation of the global docking scan output for a large and diverse set of protein-protein complexes. Two such protein-protein scoring benchmarks were built for the Dockground resource, which contains various datasets for the development and testing of protein docking methodologies. One set was generated based on the Dockground unbound docking benchmark 4, and the other based on protein models from the Dockground model-model benchmark 2. The docking decoys were designed to reflect the reality of the real-case docking applications (e.g., correct docking predictions defined as near-native rather than native structures), and to minimize applicability of approaches not directly related to the development of scoring functions (reducing clustering of predictions in the binding funnel and disparity in structural quality of the near-native and nonnative matches). The sets were further characterized by the source organism and the function of the protein-protein complexes. The sets, freely available to the research community on the Dockground webpage, present a unique, user-friendly resource for the developing and testing of protein-protein scoring approaches.

Subjects

Benchmarking, Proteins, Molecular Docking Simulation, Protein Binding, Protein Conformation, Proteins/chemistry

7.

Text mining for modeling of protein complexes enhanced by machine learning.

Badal, Varsha D; Kundrotas, Petras J; Vakser, Ilya A.

Bioinformatics ; 37(4): 497-505, 2021 05 01.

Article in English | MEDLINE | ID: mdl-32960948

ABSTRACT

MOTIVATION: Procedures for structural modeling of protein-protein complexes (protein docking) produce a number of models which need to be further analyzed and scored. Scoring can be based on independently determined constraints on the structure of the complex, such as knowledge of amino acids essential for the protein interaction. Previously, we showed that text mining of residues in freely available PubMed abstracts of papers on studies of protein-protein interactions may generate such constraints. However, absence of post-processing of the spotted residues reduced usability of the constraints, as a significant number of the residues were not relevant for the binding of the specific proteins. RESULTS: We explored filtering of the irrelevant residues by two machine learning approaches, Deep Recursive Neural Network (DRNN) and Support Vector Machine (SVM) models with different training/testing schemes. The results showed that the DRNN model is superior to the SVM model when training is performed on the PMC-OA full-text articles and applied to classification (interface or non-interface) of the residues spotted in the PubMed abstracts. When both training and testing are performed on full-text articles or on abstracts, the performance of these models is similar. Thus, in such cases, there is no need to utilize the DRNN approach, which is computationally expensive, especially at the training stage. The reason is that SVM success is often determined by the similarity in data/text patterns in the training and the testing sets, whereas the sentence structures in the abstracts are, in general, different from those in the full-text articles. AVAILABILITY AND IMPLEMENTATION: The code and the datasets generated in this study are available at https://gitlab.ku.edu/vakser-lab-public/text-mining/-/tree/2020-09-04. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Data Mining, Machine Learning, Proteins, PubMed, Support Vector Machine

8.

Structural motifs in protein cores and at protein-protein interfaces are different.

Hadarovich, Anna; Chakravarty, Devlina; Tuzikov, Alexander V; Ben-Tal, Nir; Kundrotas, Petras J; Vakser, Ilya A.

Protein Sci ; 30(2): 381-390, 2021 02.

Article in English | MEDLINE | ID: mdl-33166001

ABSTRACT

Structures of proteins and protein-protein complexes are determined by the same physical principles and thus share a number of similarities. At the same time, there could be differences because in order to function, proteins interact with other molecules, undergo conformational changes, and so forth, which might impose different restraints on the tertiary versus quaternary structures. This study focuses on structural properties of protein-protein interfaces in comparison with the protein core, based on the wealth of currently available structural data and new structure-based approaches. The results showed that physicochemical characteristics, such as amino acid composition, residue-residue contact preferences, and hydrophilicity/hydrophobicity distributions, are similar in protein core and protein-protein interfaces. On the other hand, characteristics that reflect the evolutionary pressure, such as structural composition and packing, are largely different. The results provide important insight into fundamental properties of protein structure and function. At the same time, the results contribute to better understanding of the ways to dock proteins. Recent progress in predicting structures of individual proteins follows the advancement of deep learning techniques and new approaches to residue coevolution data. Protein core could potentially provide large amounts of data for application of the deep learning to docking. However, our results showed that the core motifs are significantly different from those at protein-protein interfaces, and thus may not be directly useful for docking. At the same time, such difference may help to overcome a major obstacle in application of the coevolutionary data to docking: discrimination of the intramolecular information not directly relevant to docking.

Subjects

Protein Databases, Protein Interaction Mapping, Proteins/chemistry, Sequence Alignment, Software, Amino Acid Sequence, Proteins/genetics

9.

Dockground Tool for Development and Benchmarking of Protein Docking Procedures.

Kundrotas, Petras J; Kotthoff, Ian; Choi, Sherman W; Copeland, Matthew M; Vakser, Ilya A.

Methods Mol Biol ; 2165: 289-300, 2020.

Article in English | MEDLINE | ID: mdl-32621232

ABSTRACT

Databases of protein-protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the DOCKGROUND resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model-model complexes, and docking decoys. The datasets are available to the user community through a Web interface.

Subjects

Molecular Docking Simulation/methods, Protein Conformation, Software, Benchmarking, Molecular Docking Simulation/standards, Protein Binding

10.

Application of docking methodologies to modeled proteins.

Singh, Amar; Dauzhenka, Taras; Kundrotas, Petras J; Sternberg, Michael J E; Vakser, Ilya A.

Proteins ; 88(9): 1180-1188, 2020 09.

Article in English | MEDLINE | ID: mdl-32170770

ABSTRACT

Protein docking is essential for structural characterization of protein interactions. Besides providing the structure of protein complexes, modeling of proteins and their complexes is important for understanding the fundamental principles and specific aspects of protein interactions. The accuracy of protein modeling, in general, is still less than that of the experimental approaches. Thus, it is important to investigate the applicability of docking techniques to modeled proteins. We present new comprehensive benchmark sets of protein models for the development and validation of protein docking, as well as a systematic assessment of free and template-based docking techniques on these sets. As opposed to previous studies, the benchmark sets reflect the real-case modeling/docking scenario where the accuracy of the models is assessed by the modeling procedure, without reference to the native structure (which would be unknown in practical applications). We also expanded the analysis to include docking of protein pairs where proteins have different structural accuracy. The results show that, in general, the template-based docking is less sensitive to the structural inaccuracies of the models than the free docking. The near-native docking poses generated by the template-based approach, typically, also have higher ranks than those produced by the free docking (although the free docking is indispensable in modeling the multiplicity of protein interactions in a crowded cellular environment). The results show that docking techniques are applicable to protein models in a broad range of modeling accuracy. The study provides clear guidelines for practical applications of docking to protein models.

Subjects

Benchmarking/statistics & numerical data, Molecular Docking Simulation, Proteins/chemistry, Software, Amino Acid Sequence, Binding Sites, Protein Databases, Protein Binding, Secondary Protein Structure

11.

How to choose templates for modeling of protein complexes: Insights from benchmarking template-based docking.

Chakravarty, Devlina; McElfresh, G W; Kundrotas, Petras J; Vakser, Ilya A.

Proteins ; 88(8): 1070-1081, 2020 08.

Article in English | MEDLINE | ID: mdl-31994759

ABSTRACT

Comparative docking is based on experimentally determined structures of protein-protein complexes (templates), following the paradigm that proteins with similar sequences and/or structures form similar complexes. Modeling utilizing structure similarity of target monomers to template complexes significantly expands structural coverage of the interactome. Template-based docking by structure alignment can be performed for the entire structures or by aligning targets to the bound interfaces of the experimentally determined complexes. Systematic benchmarking of docking protocols based on full and interface structure alignment showed that both protocols perform similarly, with top 1 docking success rate 26%. However, in terms of the models' quality, the interface-based docking performed marginally better. The interface-based docking is preferable when one would suspect a significant conformational change in the full protein structure upon binding, for example, a rearrangement of the domains in multidomain proteins. Importantly, if the same structure is selected as the top template by both full and interface alignment, the docking success rate increases 2-fold for both top 1 and top 10 predictions. Matching structural annotations of the target and template proteins for template detection, as a computationally less expensive alternative to structural alignment, did not improve the docking performance. Sophisticated remote sequence homology detection added templates to the pool of those identified by structure-based alignment, suggesting that for practical docking, the combination of the structure alignment protocols and the remote sequence homology detection may be useful in order to avoid potential flaws in generation of the structural templates library.

Subjects

Molecular Docking Simulation, Peptides/chemistry, Proteins/chemistry, Software, Amino Acid Sequence, Animals, Benchmarking, Binding Sites, Dogs, Escherichia coli/chemistry, Humans, Ligands, Peptides/metabolism, Protein Binding, alpha-Helical Protein Conformation, beta-Strand Protein Conformation, Protein Interaction Domains and Motifs, Protein Interaction Mapping, Protein Multimerization, Proteins/metabolism, Research Design, Protein Structural Homology, Thermodynamics
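
The top 1 and top 10 success rates quoted in the abstract of entry 11 are conventionally computed as the fraction of targets that have at least one acceptable model among the N best-ranked predictions. A minimal sketch in Python, assuming each target is summarized by the rank of its best acceptable model (or `None` when no acceptable model was produced); the function name and the toy data are hypothetical and not taken from the paper.

```python
def top_n_success_rate(best_ranks, n):
    """Fraction of targets whose best acceptable model is ranked within the top n.

    best_ranks: one entry per target; the 1-based rank of the first
    acceptable model, or None if the target has no acceptable model.
    """
    if not best_ranks:
        raise ValueError("at least one target is required")
    hits = sum(1 for rank in best_ranks if rank is not None and rank <= n)
    return hits / len(best_ranks)


if __name__ == "__main__":
    # Toy data for four targets (illustrative only).
    ranks = [1, 7, None, 25]
    print(top_n_success_rate(ranks, 1))   # 0.25
    print(top_n_success_rate(ranks, 10))  # 0.5
```

Storing only the best rank per target keeps the calculation independent of how "acceptable" is defined, which varies between benchmarks.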

12.

Gene ontology improves template selection in comparative protein docking.

Hadarovich, Anna; Anishchenko, Ivan; Tuzikov, Alexander V; Kundrotas, Petras J; Vakser, Ilya A.

Proteins ; 87(3): 245-253, 2019 03.

Article in English | MEDLINE | ID: mdl-30520123

ABSTRACT

Structural characterization of protein-protein interactions is essential for our ability to study life processes at the molecular level. Computational modeling of protein complexes (protein docking) is important as the source of their structure and as a way to understand the principles of protein interaction. Rapidly evolving comparative docking approaches utilize target/template similarity metrics, which are often based on the protein structure. Although the structural similarity, generally, yields good performance, other characteristics of the interacting proteins (e.g., function, biological process, and localization) may improve the prediction quality, especially in the case of weak target/template structural similarity. For the ranking of a pool of models for each target, we tested scoring functions that quantify similarity of Gene Ontology (GO) terms assigned to target and template proteins in three ontology domains: biological process, molecular function, and cellular component (GO-score). The scoring functions were tested in docking of bound, unbound, and modeled proteins. The results indicate that the combined structural and GO-terms functions improve the scoring, especially in the twilight zone of structural similarity, typical for protein models of limited accuracy.

Subjects

Computational Biology, Gene Ontology, Protein Conformation, Proteins/genetics, Binding Sites/genetics, Protein Databases, Humans, Molecular Models, Molecular Docking Simulation, Protein Binding/genetics, Protein Interaction Mapping, Protein Interaction Maps/genetics, Proteins/chemistry, Software, Protein Structural Homology

13.

Computational Feasibility of an Exhaustive Search of Side-Chain Conformations in Protein-Protein Docking.

Dauzhenka, Taras; Kundrotas, Petras J; Vakser, Ilya A.

J Comput Chem ; 39(24): 2012-2021, 2018 09 15.

Article in English | MEDLINE | ID: mdl-30226647

ABSTRACT

Protein-protein docking procedures typically perform the global scan of the proteins' relative positions, followed by the local refinement of the putative matches. Because of the size of the search space, the global scan is usually implemented as rigid-body search, using computationally inexpensive intermolecular energy approximations. An adequate refinement has to take into account structural flexibility. Since the refinement performs conformational search of the interacting proteins, it is extremely computationally challenging, given the enormous number of internal degrees of freedom. Different approaches limit the search space by restricting the search to the side chains, rotameric states, coarse-grained structure representation, principal normal modes, and so on. Still, even with the approximations, the refinement presents an extreme computational challenge due to the very large number of the remaining degrees of freedom. Given the complexity of the search space, the advantage of the exhaustive search is obvious. The obstacle to such search is computational feasibility. However, the growing computational power of modern computers, especially due to the increasing utilization of Graphics Processing Units (GPUs) with a large number of specialized computing cores, extends the ranges of applicability of the brute-force search methods. This proof-of-concept study demonstrates computational feasibility of an exhaustive search of side-chain conformations in protein docking. The procedure, implemented on the GPU architecture, was used to generate the optimal conformations in a large representative set of protein-protein complexes. © 2018 Wiley Periodicals, Inc.

Subjects

Algorithms, Computational Biology, Protein Conformation, Proteins/chemistry, Feasibility Studies, Protein Binding

14.

Contact Potential for Structure Prediction of Proteins and Protein Complexes from Potts Model.

Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A.

Biophys J ; 115(5): 809-821, 2018 09 04.

Article in English | MEDLINE | ID: mdl-30122295

ABSTRACT

The energy function is the key component of protein modeling methodology. This work presents a semianalytical approach to the development of contact potentials for protein structure modeling. Residue-residue and atom-atom contact energies were derived by maximizing the probability of observing native sequences in a nonredundant set of protein structures. The optimization task was formulated as an inverse statistical mechanics problem applied to the Potts model. Its solution by pseudolikelihood maximization provides consistent estimates of coupling constants at atomic and residue levels. The best performance was achieved when interacting atoms were grouped according to their physicochemical properties. For individual protein structures, the performance of the contact potentials in distinguishing near-native structures from the decoys is similar to the top-performing scoring functions. The potentials also yielded significant improvement in the protein docking success rates. The potentials recapitulated experimentally determined protein stability changes upon point mutations and protein-protein binding affinities. The approach offers a different perspective on knowledge-based potentials and may serve as the basis for their further development.

Subjects

Molecular Models, Proteins/chemistry, Proteins/metabolism, Likelihood Functions, Point Mutation, Protein Conformation, Protein Stability, Proteins/genetics, Thermodynamics
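
For orientation on entry 14, the generic textbook form of a Potts model and of the pseudolikelihood objective is sketched below. The symbols (h_i, J_ij, sigma_i, L, M) and the sign convention are the conventional ones and are assumptions here; the abstract does not give the paper's exact parametrization, in which the fitted couplings play the role of contact energies.

```latex
% Potts model over a sequence \sigma = (\sigma_1, \dots, \sigma_L)
P(\sigma) = \frac{1}{Z} \exp\Big( \sum_{i=1}^{L} h_i(\sigma_i)
          + \sum_{1 \le i < j \le L} J_{ij}(\sigma_i, \sigma_j) \Big)

% Conditional probability of one site given all the others,
% with the symmetry convention J_{ij}(a, b) = J_{ji}(b, a)
P(\sigma_i \mid \sigma_{\setminus i}) =
  \frac{\exp\big( h_i(\sigma_i) + \sum_{j \ne i} J_{ij}(\sigma_i, \sigma_j) \big)}
       {\sum_{a} \exp\big( h_i(a) + \sum_{j \ne i} J_{ij}(a, \sigma_j) \big)}

% Pseudolikelihood objective over M training sequences \sigma^{(m)}
\ell_{\mathrm{PL}}(h, J) = \sum_{m=1}^{M} \sum_{i=1}^{L}
  \log P\big( \sigma_i^{(m)} \mid \sigma_{\setminus i}^{(m)} \big)
```

Maximizing the pseudolikelihood avoids evaluating the partition function Z, because each conditional probability is normalized over the states of a single site only.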

15.

Inhibition of protein interactions: co-crystalized protein-protein interfaces are nearly as good as holo proteins in rigid-body ligand docking.

Belkin, Saveliy; Kundrotas, Petras J; Vakser, Ilya A.

J Comput Aided Mol Des ; 32(7): 769-779, 2018 07.

Article in English | MEDLINE | ID: mdl-30003468

ABSTRACT

Modulating protein interaction pathways may lead to the cure of many diseases. Known protein-protein inhibitors bind to large pockets on the protein-protein interface. Such large pockets are detected also in the protein-protein complexes without known inhibitors, making such complexes potentially druggable. The inhibitor-binding site is primarily defined by the side chains that form the largest pocket in the protein-bound conformation. Low-resolution ligand docking shows that the success rate for the protein-bound conformation is close to the one for the ligand-bound conformation, and significantly higher than for the apo conformation. The conformational change on the protein interface upon binding to the other protein results in a pocket employed by the ligand when it binds to that interface. This proof-of-concept study suggests that rather than using computational pocket-opening procedures, one can opt for an experimentally determined structure of the target co-crystallized protein-protein complex as a starting point for drug design.

Subjects

Molecular Docking Simulation, Proteins/antagonists & inhibitors, Proteins/chemistry, Binding Sites, Crystallization, Protein Databases, Drug Design, Ligands, Proof of Concept Study, Protein Binding, Protein Conformation

16.

Natural language processing in text mining for structural modeling of protein complexes.

Badal, Varsha D; Kundrotas, Petras J; Vakser, Ilya A.

BMC Bioinformatics ; 19(1): 84, 2018 03 05.

Article in English | MEDLINE | ID: mdl-29506465

ABSTRACT

BACKGROUND: Structural modeling of protein-protein interactions produces a large number of putative configurations of the protein complexes. Identification of the near-native models among them is a serious challenge. Publicly available results of biomedical research may provide constraints on the binding mode, which can be essential for the docking. Our text-mining (TM) tool, which extracts binding site residues from the PubMed abstracts, was successfully applied to protein docking (Badal et al., PLoS Comput Biol, 2015; 11: e1004630). Still, many extracted residues were not relevant to the docking. RESULTS: We present an extension of the TM tool, which utilizes natural language processing (NLP) for analyzing the context of the residue occurrence. The procedure was tested using generic and specialized dictionaries. The results showed that the keyword dictionaries designed for identification of protein interactions are not adequate for the TM prediction of the binding mode. However, our dictionary designed to distinguish keywords relevant to the protein binding sites led to considerable improvement in the TM performance. We investigated the utility of several methods of context analysis, based on dissection of the sentence parse trees. The machine learning-based NLP filtered the pool of the mined residues significantly more efficiently than the rule-based NLP. Constraints generated by NLP were tested in docking of unbound proteins from the DOCKGROUND X-ray benchmark set 4. The output of the global low-resolution docking scan was post-processed, separately, by constraints from the basic TM, constraints re-ranked by NLP, and the reference constraints. The quality of a match was assessed by the interface root-mean-square deviation. The results showed significant improvement of the docking output when using the constraints generated by the advanced TM with NLP. 
CONCLUSIONS: The basic TM procedure for extracting protein-protein binding site residues from the PubMed abstracts was significantly advanced by the deep parsing (NLP techniques for contextual analysis) in purging of the initial pool of the extracted residues. Benchmarking showed a substantial increase of the docking success rate based on the constraints generated by the advanced TM with NLP.

Subjects

Data Mining, Molecular Models, Natural Language Processing, Proteins/chemistry, Machine Learning, Protein Binding, Semantics, Support Vector Machine
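
The abstract of entry 16 assesses match quality by the interface root-mean-square deviation. The sketch below shows only the RMSD arithmetic in Python, assuming the two sets of interface atoms are already paired one-to-one and optimally superposed; the selection of interface atoms and the superposition step are omitted, and the function name is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    native = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
    model = [(3.0, 4.0, 0.0), (4.0, 5.0, 1.0)]  # native shifted by (3, 4, 0)
    print(rmsd(native, model))  # 5.0
```

Without the superposition step this value is an upper bound on the superposed RMSD, so it should not be compared directly with published interface RMSD thresholds.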

17.

Modeling CAPRI targets 110-120 by template-based and free docking using contact potential and combined scoring function.

Kundrotas, Petras J; Anishchenko, Ivan; Badal, Varsha D; Das, Madhurima; Dauzhenka, Taras; Vakser, Ilya A.

Proteins ; 86 Suppl 1: 302-310, 2018 03.

Article in English | MEDLINE | ID: mdl-28905425

ABSTRACT

The paper presents analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates the incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, biological interface relevance, etc. The scoring function still does not distinguish biological interfaces from crystal packing interfaces well, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking, by contrast, is very sensitive to the quality of the individual models. Even so, our newly developed contact potential detected approximate locations of the binding sites.

Subjects

Computational Biology/methods , Protein Databases , Molecular Models , Protein Conformation , Protein Multimerization , Proteins/chemistry , Software , Humans , Protein Binding , Protein Sequence Analysis

18.

Dockground: A comprehensive data resource for modeling of protein complexes.

Kundrotas, Petras J; Anishchenko, Ivan; Dauzhenka, Taras; Kotthoff, Ian; Mnevets, Daniil; Copeland, Matthew M; Vakser, Ilya A.

Protein Sci ; 27(1): 172-181, 2018 01.

Article in English | MEDLINE | ID: mdl-28891124

ABSTRACT

Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website.

Subjects

Molecular Docking Simulation , Multiprotein Complexes/chemistry , Software

19.

Structural quality of unrefined models in protein docking.

Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A.

Proteins ; 85(1): 39-45, 2017 01.

Article in English | MEDLINE | ID: mdl-27756103

ABSTRACT

Structural characterization of protein-protein interactions is essential for understanding life processes at the molecular level. However, only a fraction of protein interactions have experimentally resolved structures. Thus, reliable computational methods for structural modeling of protein interactions (protein docking) are important for generating such structures and understanding the principles of protein recognition. Template-based docking techniques that utilize structural similarity between target protein-protein interaction and cocrystallized protein-protein complexes (templates) are gaining popularity due to generally higher reliability than that of the template-free docking. However, the template-based approach lacks explicit penalties for intermolecular penetration, as opposed to the typical free docking where such penalty is inherent due to the shape complementarity paradigm. Thus, template-based docking models are commonly assumed to require special treatment to remove large structural penetrations. In this study, we compared clashes in the template-based and free docking of the same proteins, with crystallographically determined and modeled structures. The results show that for the less accurate protein models, free docking produces fewer clashes than the template-based approach. However, contrary to the common expectation, in acceptable and better quality docking models of unbound crystallographically determined proteins, the clashes in the template-based docking are comparable to those in the free docking, due to the overall higher quality of the template-based docking predictions. This suggests that the free docking refinement protocols can in principle be applied to the template-based docking predictions as well. Proteins 2016; 85:39-45. © 2016 Wiley Periodicals, Inc.
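
The comparison of clashes described above can be pictured with a simple counting routine. The abstract does not give the clash definition, so this sketch assumes a clash is any pair of atoms, one from each protein, closer than a distance cutoff. The name `count_clashes` and the 3.0 Å default are illustrative assumptions, not values from the paper.

```python
import math


def count_clashes(receptor_atoms, ligand_atoms, cutoff=3.0):
    """Count intermolecular atom pairs closer than `cutoff` (in Å).

    Assumption: each argument is a list of (x, y, z) tuples, and the
    cutoff value is a placeholder chosen for illustration only.
    """
    clashes = 0
    for a in receptor_atoms:
        for b in ligand_atoms:
            # math.dist gives the Euclidean distance between two points.
            if math.dist(a, b) < cutoff:
                clashes += 1
    return clashes
```

The double loop is quadratic in the number of atoms, which is acceptable for a sketch. A spatial grid or k-d tree would be the usual choice for full-size complexes.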

Subjects

Molecular Docking Simulation , Proteins/chemistry , Binding Sites , Computational Biology/methods , X-Ray Crystallography , Protein Binding , Protein Interaction Domains and Motifs , Protein Interaction Mapping , Secondary Protein Structure , Software , Protein Structural Homology

20.

Modeling complexes of modeled proteins.

Anishchenko, Ivan; Kundrotas, Petras J; Vakser, Ilya A.

Proteins ; 85(3): 470-478, 2017 03.

Article in English | MEDLINE | ID: mdl-27701777

ABSTRACT

Structural characterization of proteins is essential for understanding life processes at the molecular level. However, only a fraction of known proteins have experimentally determined structures. This fraction is even smaller for protein-protein complexes. Thus, structural modeling of protein-protein interactions (docking) primarily has to rely on modeled structures of the individual proteins, which typically are less accurate than the experimentally determined ones. Such "double" modeling is the Grand Challenge of structural reconstruction of the interactome. Yet it remains so far largely untested in a systematic way. We present a comprehensive validation of template-based and free docking on a set of 165 complexes, where each protein model has six levels of structural accuracy, from 1 to 6 Å Cα RMSD. Many template-based docking predictions fall into acceptable quality category, according to the CAPRI criteria, even for highly inaccurate proteins (5-6 Å RMSD), although the number of such models (and, consequently, the docking success rate) drops significantly for models with RMSD > 4 Å. The results show that the existing docking methodologies can be successfully applied to protein models with a broad range of structural accuracy, and the template-based docking is much less sensitive to inaccuracies of protein models than the free docking. Proteins 2017; 85:470-478. © 2016 Wiley Periodicals, Inc.
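
The docking success rate mentioned above can be illustrated with a small calculation. The abstract does not state the exact definition, so this sketch assumes the rate is the fraction of complexes for which at least one prediction is rated acceptable or better. The function name `success_rate` and the input layout are hypothetical, and the quality flags are expected to come from an external assessment such as the CAPRI criteria.

```python
def success_rate(quality_by_complex):
    """Fraction of complexes with at least one acceptable-or-better model.

    Assumption: `quality_by_complex` maps a complex identifier to a list
    of booleans, one per docking prediction (True = acceptable or better).
    """
    if not quality_by_complex:
        raise ValueError("at least one complex is required")
    # A complex counts as a success if any of its predictions passes.
    hits = sum(1 for flags in quality_by_complex.values() if any(flags))
    return hits / len(quality_by_complex)
```

Computing this rate separately for each level of model accuracy would show how it changes with the RMSD of the input models, which is the kind of trend the abstract reports.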

Subjects

Algorithms , Computational Biology/methods , Molecular Docking Simulation/methods , Proteins/chemistry , Software , Amino Acid Motifs , Benchmarking , Binding Sites , X-Ray Crystallography , Protein Binding , Protein Conformation , Research Design , Thermodynamics
